Safe Driving in Occluded Environments

Wang, Zhuoyuan, Jia, Tongyao, Rajborirug, Pharuj, Ramesh, Neeraj, Okuda, Hiroyuki, Suzuki, Tatsuya, Kar, Soummya, Nakahira, Yorie

arXiv.org Artificial Intelligence

Ensuring safe autonomous driving in the presence of occlusions poses a significant challenge for policy design. While existing model-driven control techniques based on set invariance can handle visible risks, occlusions create latent risks in which safety-critical states are not observable. Data-driven techniques also struggle with latent risks, because a direct mapping from risk-critical objects in sensor inputs to safe actions cannot be learned when those objects are not visible. Motivated by these challenges, we propose a probabilistic safety certificate for latent risk. Our key technical enabler is the application of probabilistic invariance: it relaxes the strict observability requirements of set-invariance methods, which demand knowledge of risk-critical states. The proposed technique yields linear action constraints that confine the latent-risk probability within a tolerance. Such constraints can be integrated into model predictive controllers or embedded in data-driven policies to mitigate latent risks. The method is tested in the CARLA simulator and compared with several existing techniques. Theoretical and empirical analyses jointly demonstrate that the proposed method assures long-term safety in real-time control in occluded environments, without being overly conservative and with transparency to exposed risks.
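The certificate's output, a linear constraint on the action, can be enforced by minimally correcting a nominal command. A minimal sketch, assuming a single halfspace constraint `a @ u <= b` (the actual certificate terms in the paper are derived from the latent-risk probability, not supplied here):

```python
import numpy as np

def safe_action(u_nom, a, b):
    """Project a nominal action onto the halfspace {u : a @ u <= b}.

    The linear form mirrors the paper's action constraints; the vector a
    and bound b are hypothetical placeholders for illustration.
    """
    violation = a @ u_nom - b
    if violation <= 0:
        return u_nom                         # nominal action already safe
    return u_nom - violation * a / (a @ a)   # minimal-norm correction

# Nominal command violating a @ u <= 1 gets pulled back onto the boundary.
u = safe_action(np.array([2.0, 0.0]), np.array([1.0, 0.0]), 1.0)  # -> [1.0, 0.0]
```

The closed-form projection keeps the filter cheap enough for real-time control; inside an MPC, the same inequality would simply be added to the constraint set.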


STREAM (ChemBio): A Standard for Transparently Reporting Evaluations in AI Model Reports

McCaslin, Tegan, Alaga, Jide, Nedungadi, Samira, Donoughe, Seth, Reed, Tom, Bommasani, Rishi, Painter, Chris, Righetti, Luca

arXiv.org Artificial Intelligence

Evaluations of dangerous AI capabilities are important for managing catastrophic risks. Public transparency into these evaluations - including what they test, how they are conducted, and how their results inform decisions - is crucial for building trust in AI development. We propose STREAM (A Standard for Transparently Reporting Evaluations in AI Model Reports), a standard to improve how model reports disclose evaluation results, initially focusing on chemical and biological (ChemBio) benchmarks. Developed in consultation with 23 experts across government, civil society, academia, and frontier AI companies, this standard is designed to (1) be a practical resource to help AI developers present evaluation results more clearly, and (2) help third parties identify whether model reports provide sufficient detail to assess the rigor of the ChemBio evaluations. We concretely demonstrate our proposed best practices with "gold standard" examples, and also provide a three-page reporting template to enable AI developers to implement our recommendations more easily.
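A reporting standard like this lends itself to mechanical completeness checks. A toy sketch, with field names that are illustrative stand-ins rather than STREAM's actual schema:

```python
# Check that a ChemBio evaluation entry in a model report carries enough
# detail for third-party review. REQUIRED_FIELDS is a hypothetical subset,
# not the standard's real template.
REQUIRED_FIELDS = {"benchmark", "score", "elicitation_method",
                   "threshold_rationale", "decision_link"}

def missing_fields(entry: dict) -> set:
    """Return which required reporting fields an entry omits."""
    return REQUIRED_FIELDS - entry.keys()

entry = {"benchmark": "wmdp-bio", "score": 0.62}
gaps = missing_fields(entry)   # three fields still unreported
```

Such a check captures only presence, not rigor; the standard's prose guidance and "gold standard" examples address the latter.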


Data Driven Diagnosis for Large Cyber-Physical-Systems with Minimal Prior Information

Steude, Henrik Sebastian, Diedrich, Alexander, Pill, Ingo, Moddemann, Lukas, Vranješ, Daniel, Niggemann, Oliver

arXiv.org Artificial Intelligence

Diagnostic processes for complex cyber-physical systems often require extensive prior knowledge in the form of detailed system models or comprehensive training data. However, obtaining such information poses a significant challenge. To address this issue, we present a new diagnostic approach that operates with minimal prior knowledge, requiring only a basic understanding of subsystem relationships and data from nominal operations. Our method combines a neural network-based symptom generator, which employs subsystem-level anomaly detection, with a new graph diagnosis algorithm that leverages minimal causal relationship information between subsystems, information that is typically available in practice. Our experiments with fully controllable simulated datasets show that our method includes the true causal component in its diagnosis set for 82% of all cases while effectively reducing the search space in 73% of the scenarios. Additional tests on the real-world Secure Water Treatment dataset showcase the approach's potential for practical scenarios. Our results thus highlight our approach's potential for practical applications with large and complex cyber-physical systems where limited prior knowledge is available.
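The graph step can be illustrated with a toy sketch: if anomalies are assumed to propagate along causal edges, an anomalous subsystem whose causal parents are all nominal is a root-cause candidate. This is a simplification for illustration, not the paper's actual algorithm:

```python
# Toy graph-based root-cause narrowing over subsystem-level anomaly flags.
def root_cause_candidates(edges, anomalous):
    """edges: iterable of (cause, effect) pairs between subsystems.
    anomalous: set of subsystems flagged by the symptom generator."""
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)
    # Keep anomalous nodes with no anomalous causal parent.
    return {n for n in anomalous
            if not (parents.get(n, set()) & anomalous)}

# Chain A -> B -> C with B and C anomalous: the search space shrinks
# from {B, C} to the upstream-most anomaly {B}.
cands = root_cause_candidates([("A", "B"), ("B", "C")], {"B", "C"})
```

Even this crude rule shows why coarse cause-effect edges, which plants typically document, suffice to shrink the diagnosis set.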


Efficient Evaluation of Quantization-Effects in Neural Codecs

Mack, Wolfgang, Mustafa, Ahmed, Łaganowski, Rafał, Hijazy, Samer

arXiv.org Artificial Intelligence

Neural codecs, comprising an encoder, quantizer, and decoder, enable signal transmission at exceptionally low bitrates. Training these systems requires techniques like the straight-through estimator, soft-to-hard annealing, or statistical quantizer emulation to allow a non-zero gradient across the quantizer. Evaluating the effect of quantization in neural codecs, like the influence of gradient passing techniques on the whole system, is often costly and time-consuming due to training demands and the lack of affordable and reliable metrics. This paper proposes an efficient evaluation framework for neural codecs using simulated data with a defined number of bits and low-complexity neural encoders/decoders to emulate the non-linear behavior in larger networks. Our system is highly efficient in terms of training time and computational and hardware requirements, allowing us to uncover distinct behaviors in neural codecs. We propose a modification to stabilize training with the straight-through estimator based on our findings. We validate our findings against an internal neural audio codec and against the state-of-the-art descript-audio-codec.
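The straight-through estimator the abstract refers to can be sketched without an autograd framework: the forward pass applies a hard quantizer, while the backward pass pretends the quantizer is the identity so a non-zero gradient reaches the encoder. A minimal numpy illustration (the bit depth and grid are arbitrary choices, not the paper's setup):

```python
import numpy as np

def quantize_forward(x, n_bits=3):
    """Hard quantization to a 2**n_bits-step grid; gradient is zero a.e."""
    levels = 2 ** n_bits
    return np.round(x * levels) / levels

def quantize_backward(grad_out):
    """STE backward: pass the upstream gradient through unchanged."""
    return grad_out

x = np.array([0.11, 0.27])
y = quantize_forward(x)                 # values snapped to the 3-bit grid
g = quantize_backward(np.ones_like(x))  # gradient survives the quantizer
```

In a real codec the same trick is usually written as `x + stop_gradient(round(x) - x)`, which fuses both passes into one expression.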


Artificial Intelligence in Traffic Systems

Saxena, Ritwik Raj

arXiv.org Artificial Intelligence

Existing research on AI-based traffic management systems, utilizing techniques such as fuzzy logic, reinforcement learning, deep neural networks, and evolutionary algorithms, demonstrates the potential of AI to transform the traffic landscape. This article endeavors to review the topics where AI and traffic management intersect. It comprises areas like AI-powered traffic signal control systems, automatic distance and velocity recognition (for instance, in autonomous vehicles, hereafter AVs), smart parking systems, and Intelligent Traffic Management Systems (ITMS), which use data captured in real-time to keep track of traffic conditions, and traffic-related law enforcement and surveillance using AI. AI applications in traffic management cover a wide range of spheres. The spheres comprise, inter alia, streamlining traffic signal timings, predicting traffic bottlenecks in specific areas, detecting potential accidents and road hazards, managing incidents accurately, advancing public transportation systems, development of innovative driver assistance systems, and minimizing environmental impact through simplified routes and reduced emissions. The benefits of AI in traffic management are also diverse. They comprise improved management of traffic data, sounder route decision automation, easier and speedier identification and resolution of vehicular issues through monitoring the condition of individual vehicles, decreased traffic snarls and mishaps, superior resource utilization, alleviated stress of traffic management manpower, greater on-road safety, and better emergency response time.


Generative AI-based Pipeline Architecture for Increasing Training Efficiency in Intelligent Weed Control Systems

Modak, Sourav, Stein, Anthony

arXiv.org Artificial Intelligence

In automated crop protection tasks such as weed control, disease diagnosis, and pest monitoring, deep learning has demonstrated significant potential. However, these advanced models rely heavily on high-quality, diverse datasets, often limited and costly in agricultural settings. Traditional data augmentation can increase dataset volume but usually lacks the real-world variability needed for robust training. This study presents a new approach for generating synthetic images to improve deep learning-based object detection models for intelligent weed control. Our GenAI-based image generation pipeline integrates the Segment Anything Model (SAM) for zero-shot domain adaptation with a text-to-image Stable Diffusion Model, enabling the creation of synthetic images that capture diverse real-world conditions. We evaluate these synthetic datasets using lightweight YOLO models, measuring data efficiency with mAP50 and mAP50-95 scores across varying proportions of real and synthetic data. Notably, YOLO models trained on datasets with 10% synthetic and 90% real images generally demonstrate superior mAP50 and mAP50-95 scores compared to those trained solely on real images. This approach not only reduces dependence on extensive real-world datasets but also enhances predictive performance. The integration of this approach opens opportunities for achieving continual self-improvement of perception modules in intelligent technical systems.
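Building a training set with a fixed synthetic share, like the 10% synthetic / 90% real mix reported above, is a one-function job. A sketch whose sampling details (seeding, sizing) are our own choices, not the paper's pipeline:

```python
import random

def mix_datasets(real, synthetic, synth_fraction=0.1, seed=0):
    """Combine all real samples with enough synthetic ones so that the
    synthetic share of the final set equals synth_fraction."""
    rng = random.Random(seed)
    n_synth = round(len(real) * synth_fraction / (1.0 - synth_fraction))
    mixed = list(real) + rng.sample(list(synthetic), n_synth)
    rng.shuffle(mixed)
    return mixed

# 90 real images plus a 10% synthetic share -> 100 training samples.
train = mix_datasets([f"real_{i}" for i in range(90)],
                     [f"synth_{i}" for i in range(50)])
```

Sweeping `synth_fraction` and re-training YOLO at each setting is how the mAP50 / mAP50-95 curves in the study would be reproduced.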


Learning from Naturally Occurring Feedback

Don-Yehiya, Shachar, Choshen, Leshem, Abend, Omri

arXiv.org Artificial Intelligence

Human feedback data is a critical component in developing language models. However, collecting this feedback is costly and ultimately not scalable. We propose a scalable method for extracting feedback that users naturally include when interacting with chat models, and leveraging it for model training. We are further motivated by previous work that showed there are also qualitative advantages to using naturalistic (rather than auto-generated) feedback, such as fewer hallucinations and biases. We manually annotated conversation data to confirm the presence of naturally occurring feedback in a standard corpus, finding that as much as 30% of the chats include explicit feedback. We apply our method to over 1M conversations to obtain hundreds of thousands of feedback samples. Training with the extracted feedback shows significant performance improvements over baseline models, demonstrating the efficacy of our approach in enhancing model alignment to human preferences.
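The extraction idea can be sketched as pattern matching over user turns. The cue phrases below are illustrative stand-ins; the paper's taxonomy and extraction method are richer than a pair of regexes:

```python
import re

# Toy detector for explicit, naturally occurring feedback in a user turn.
POSITIVE = re.compile(r"\b(thanks|that works|perfect|exactly)\b", re.I)
NEGATIVE = re.compile(r"\b(that'?s wrong|not what I asked|doesn'?t work)\b", re.I)

def extract_feedback(user_turn: str):
    """Label a turn as positive/negative feedback, or None if neutral."""
    if NEGATIVE.search(user_turn):
        return "negative"
    if POSITIVE.search(user_turn):
        return "positive"
    return None

labels = [extract_feedback(t) for t in
          ["Thanks, that works!", "That's wrong, try again.", "What about dates?"]]
```

Applied over millions of logged conversations, even a low per-turn hit rate yields the "hundreds of thousands of feedback samples" scale the abstract describes.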


Design Principles for Falsifiable, Replicable and Reproducible Empirical ML Research

Vranješ, Daniel, Niggemann, Oliver

arXiv.org Artificial Intelligence

Empirical research plays a fundamental role in the machine learning domain. At the heart of impactful empirical research lies the development of clear research hypotheses, which then shape the design of experiments. The execution of experiments must be carried out with precision to ensure reliable results, followed by statistical analysis to interpret these outcomes. This process is key to either supporting or refuting initial hypotheses. Despite its importance, there is high variability in research practices across the machine learning community and no uniform understanding of quality criteria for empirical research. To address this gap, we propose a model for the empirical research process, accompanied by guidelines to uphold the validity of empirical research. By embracing these recommendations, the community can achieve greater consistency, enhanced reliability, and increased impact.


Knowledge Guided Semi-Supervised Learning for Quality Assessment of User Generated Videos

Mitra, Shankhanil, Soundararajan, Rajiv

arXiv.org Artificial Intelligence

Perceptual quality assessment of user generated content (UGC) videos is challenging due to the requirement of large scale human annotated videos for training. In this work, we address this challenge by first designing a self-supervised Spatio-Temporal Visual Quality Representation Learning (ST-VQRL) framework to generate robust quality aware features for videos. Then, we propose a dual-model based Semi Supervised Learning (SSL) method specifically designed for the Video Quality Assessment (SSL-VQA) task, through a novel knowledge transfer of quality predictions between the two models. Our SSL-VQA method uses the ST-VQRL backbone to produce robust performances across various VQA datasets including cross-database settings, despite being learned with limited human annotated videos.


Perceptual Quality Assessment of Face Video Compression: A Benchmark and An Effective Method

Li, Yixuan, Chen, Bolin, Chen, Baoliang, Wang, Meng, Wang, Shiqi, Lin, Weisi

arXiv.org Artificial Intelligence

Recent years have witnessed an exponential increase in the demand for face video compression, and the success of artificial intelligence has expanded the boundaries beyond traditional hybrid video coding. Generative coding approaches have been identified as promising alternatives with reasonable perceptual rate-distortion trade-offs, leveraging the statistical priors of face videos. However, the great diversity of distortion types in spatial and temporal domains, ranging from the traditional hybrid coding frameworks to generative models, presents grand challenges in compressed face video quality assessment (VQA), which plays a crucial role in the whole delivery chain for quality monitoring and optimization. In this paper, we introduce the large-scale Compressed Face Video Quality Assessment (CFVQA) database, which is the first attempt to systematically understand the perceptual quality and diversified compression distortions in face videos. The database contains 3,240 compressed face video clips at multiple compression levels, which are derived from 135 source videos with diversified content using six representative video codecs, including two traditional methods based on hybrid coding frameworks, two end-to-end methods, and two generative methods. The unique characteristics of CFVQA, including large scale, fine granularity, great content diversity, and cross-compression distortion types, make benchmarking of existing image quality assessment (IQA) and VQA methods feasible and practical. The results reveal the weakness of existing IQA and VQA models, which challenges real-world face video applications. In addition, a FAce VideO IntegeRity (FAVOR) index for face video compression was developed to measure the perceptual quality, considering the distinct content characteristics and temporal priors of the face videos. Experimental results exhibit its superior performance on the proposed CFVQA dataset.
Face video-based services have been growing exponentially, coinciding with the accelerated proliferation of mobile communication and online video content sharing platforms. Face video compression towards human vision, which is indispensable in compressing and delivering gigantic-scale face video data, inevitably introduces visual distortions. During the past decade, advancements in video compression technology have substantially benefited face video compression.
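Benchmarks like CFVQA typically score a quality model by rank correlation between its predictions and human mean opinion scores (MOS). A minimal Spearman correlation on made-up scores (no tie handling; the data here is illustrative, not from the database):

```python
def spearman(pred, mos):
    """Spearman rank correlation between predicted quality and MOS,
    assuming no tied values."""
    def ranks(xs):
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        r = [0] * len(xs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rp, rm = ranks(pred), ranks(mos)
    n = len(pred)
    d2 = sum((a - b) ** 2 for a, b in zip(rp, rm))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Predictions that order the clips exactly as the human MOS does.
rho = spearman([0.2, 0.5, 0.9, 0.4], [1.8, 3.1, 4.6, 2.9])  # -> 1.0
```

Reporting this correlation per codec family is what exposes the cross-compression weaknesses of existing IQA/VQA models that the abstract mentions.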